Semi-Supervised Learning for Imbalanced Sentiment Classification
Abstract
Various semi-supervised learning methods have recently been proposed to address the long-standing shortage of manually labeled data in sentiment classification. However, most existing studies assume that negative and positive samples are balanced in both the labeled and unlabeled data, which may not hold in practice. In this paper, we investigate the more common case of semi-supervised learning for imbalanced sentiment classification. In particular, various random subspaces are dynamically generated to deal with the imbalanced class distribution. Evaluation across four domains shows the effectiveness of our approach.
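The abstract gives no algorithmic detail, so the following is only a rough sketch of how random subspaces could be combined with class balancing. It is not the authors' method. The function name `balanced_subspace_views`, its parameters, and the choice to under-sample the majority class are all assumptions made for illustration.

```python
import random

def balanced_subspace_views(labels, n_features, n_views=5, subspace_frac=0.5, seed=0):
    """Return (row_indices, feature_indices) pairs, one per view.

    Assumption (not from the paper): each view under-samples every class
    down to the minority-class size and keeps a random subset of features.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)

    minority_size = min(len(rows) for rows in by_class.values())
    n_keep = max(1, int(n_features * subspace_frac))

    views = []
    for _ in range(n_views):
        rows = []
        for class_rows in by_class.values():
            # Draw the same number of samples from each class.
            rows.extend(rng.sample(class_rows, minority_size))
        # Pick a random feature subspace for this view.
        features = sorted(rng.sample(range(n_features), n_keep))
        views.append((sorted(rows), features))
    return views

# Toy data: 8 negative and 2 positive samples over 6 features.
labels = [0] * 8 + [1] * 2
views = balanced_subspace_views(labels, n_features=6, n_views=3)
```

Each view could then train its own classifier on a balanced slice of the data. Regenerating the views at every round of a semi-supervised loop would be one possible reading of "dynamically generated", though the abstract does not confirm this.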
Similar Resources
Reserved Self-training: A Semi-supervised Sentiment Classification Method for Chinese Microblogs
The imbalanced sentiment distribution of microblogs leads to poor performance of binary classifiers on the minority class. To address this problem, we present a semi-supervised method for sentiment classification of Chinese microblogs. This method is similar to self-training, except that a set of labeled samples is reserved for a confidence-score computation process through which samples that are ...
WEMOTE - Word Embedding based Minority Oversampling Technique for Imbalanced Emotion and Sentiment Classification
Imbalanced training data is a persistent problem for supervised learning based emotion and sentiment classification. Several existing studies have shown that data sparseness and small disjuncts are the two major factors affecting classification. Targeting these two problems, this paper presents a word embedding based oversampling method. Firstly, a large-scale text corpus is used to train a continuous ...
Active Deep Networks for Semi-Supervised Sentiment Classification
This paper presents a novel semi-supervised learning algorithm called Active Deep Networks (ADN) to address the semi-supervised sentiment classification problem with active learning. First, we propose the semi-supervised learning method of ADN. ADN is constructed by Restricted Boltzmann Machines (RBM) with unsupervised learning using labeled data and abundant unlabeled data. Then the constru...
Sentiment Analysis by Augmenting Expectation Maximisation with Lexical Knowledge
Sentiment analysis of documents aims to characterise the positive or negative sentiment expressed in documents. It has been formulated as a supervised classification problem, which requires large numbers of labelled documents. Semi-supervised sentiment classification using limited documents or words labelled with sentiment polarities is an approach to reducing labelling cost for effective learn...
Semi-supervised Learning for Sentiment Classification
With the growing need to identify opinions and sentiments automatically from online text data, sentiment classification tasks have received considerable attention recently. One can treat sentiment classification as a text classification problem; however, it is very time-consuming and somewhat impractical to acquire enough labeled data to train a good sentiment classifier. This paper investig...